{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to the dataset"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<p><b>Intracranial hemorrhage 🧠 (ICH)</b> is bleeding within the skull or the brain tissue itself, a life-threatening type of stroke. A stroke occurs when the brain is deprived of oxygen and blood supply. ICH is most commonly caused by hypertension, arteriovenous malformations, or head trauma. Treatment focuses on stopping the bleeding, removing the blood clot (hematoma), and relieving the pressure on the brain.</p>\n",
    "<br/><br/>\n",
    "<p><b>Diagnosis</b> requires an urgent procedure. When a patient shows acute neurological symptoms such as severe headache or loss of consciousness, highly trained specialists review medical images of the patient’s cranium to look for the presence, location, and type of hemorrhage. The process is complicated and often time-consuming.</p>\n",
    "<br/><br/>\n",
    "<p>The current clinical protocol for diagnosing ICH is for a radiologist to examine computed tomography (CT) scans to determine whether ICH has occurred and, if so, to identify its type and region. However, this process relies on the availability of a subspecialty-trained neuroradiologist and, as a result, can be slow and even inaccurate, especially in remote areas where specialized care is scarce.</p>\n",
    "<br/><br/>\n",
    "<p>In recent years, advances in <b>deep learning</b> have made it possible to solve a wide range of problems, in some cases with better results than humans. We will try to solve intracranial hemorrhage detection and segmentation using a dataset of brain CT scans annotated by expert radiologists.</p>\n",
    "<p>The challenge is to build an algorithm to detect acute intracranial hemorrhage and its subtypes.</p>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    ">Intraparenchymal hemorrhage is blood that is located completely within the brain itself; intraventricular or subarachnoid hemorrhage is blood that has leaked into the spaces of the brain that normally contain cerebrospinal fluid (the ventricles or subarachnoid cisterns). Extra-axial hemorrhages are blood that collects in the tissue coverings that surround the brain (e.g. subdural or epidural subtypes). (See figure.) Patients may exhibit more than one type of cerebral hemorrhage, which may appear on the same image. While small hemorrhages are typically less morbid than large hemorrhages, even a small hemorrhage can lead to death because it is an indicator of another type of serious abnormality (e.g. cerebral aneurysm).\n",
    ">\n",
    "> #### There are five types of ICH:\n",
    ">    * **Intraparenchymal hemorrhage**\n",
    ">    * **Epidural hemorrhage**\n",
    ">    * **Subdural hemorrhage**\n",
    ">    * **Subarachnoid hemorrhage**\n",
    ">    * **Intraventricular hemorrhage**\n",
    ">\n",
    "> One patient can exhibit more than one type of hemorrhage."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![file](https://user-images.githubusercontent.com/58046531/89164136-4eac1d00-d594-11ea-9408-6d271518b3a7.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This dataset has six classes:\n",
    "1. any - whether any of the five hemorrhage subtypes is present\n",
    "2. epidural\n",
    "3. intraparenchymal\n",
    "4. intraventricular\n",
    "5. subarachnoid\n",
    "6. subdural\n",
    "\n",
    "A single patient may have more than one type of hemorrhage."
   ]
  },
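  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a small illustration of this multi-label layout, the sketch below builds a made-up label table and derives `any` as the logical OR of the five subtype columns. The rows are invented, and treating `any` as exactly that OR is an assumption based on the description above, not something verified against the dataset.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "SUBTYPES = ['epidural', 'intraparenchymal', 'intraventricular', 'subarachnoid', 'subdural']\n",
    "\n",
    "# Hypothetical labels for three images; the second has two subtypes at once.\n",
    "labels = pd.DataFrame(\n",
    "    [[0, 0, 0, 0, 0], [0, 1, 0, 1, 0], [0, 0, 0, 0, 1]],\n",
    "    columns=SUBTYPES,\n",
    ")\n",
    "\n",
    "# 'any' is 1 when at least one subtype is present in the row.\n",
    "labels['any'] = labels[SUBTYPES].max(axis=1)\n",
    "print(labels)\n",
    "```"
   ]
  },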
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "base_url = '~/kaggle/rsna-intracranial-hemorrhage-detection/'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "TRAIN_DIR = '/home/ubuntu/kaggle/rsna-intracranial-hemorrhage-detection/stage_2_train'\n",
    "TEST_DIR = '/home/ubuntu/kaggle/rsna-intracranial-hemorrhage-detection/stage_2_test'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import pandas as pd\n",
    "import swifter\n",
    "import numpy as np\n",
    "from tqdm import tqdm\n",
    "import re\n",
    "import seaborn as sns\n",
    "import pydicom\n",
    "import joblib"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Requirement already satisfied: pydicom in /home/ubuntu/anaconda3/envs/tensorflow2_p36/lib/python3.6/site-packages (2.0.0)\n",
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "%pip install pydicom"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "752803\r\n"
     ]
    }
   ],
   "source": [
    "! ls {TRAIN_DIR} | wc -l\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "121232\r\n"
     ]
    }
   ],
   "source": [
    "! ls {TEST_DIR} | wc -l"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ID_000012eaf.dcm\n",
      "ID_000039fa0.dcm\n",
      "ID_00005679d.dcm\n",
      "ID_00008ce3c.dcm\n",
      "ID_0000950d7.dcm\n",
      "ls: write error: Broken pipe\n"
     ]
    }
   ],
   "source": [
    "! ls {TRAIN_DIR} | head -n 5"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(4516842, 2)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>Label</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ID_12cadc6af_epidural</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ID_12cadc6af_intraparenchymal</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ID_12cadc6af_intraventricular</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ID_12cadc6af_subarachnoid</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ID_12cadc6af_subdural</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>ID_12cadc6af_any</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>ID_38fd7baa0_epidural</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>ID_38fd7baa0_intraparenchymal</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>ID_38fd7baa0_intraventricular</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>ID_38fd7baa0_subarachnoid</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                              ID  Label\n",
       "0          ID_12cadc6af_epidural      0\n",
       "1  ID_12cadc6af_intraparenchymal      0\n",
       "2  ID_12cadc6af_intraventricular      0\n",
       "3      ID_12cadc6af_subarachnoid      0\n",
       "4          ID_12cadc6af_subdural      0\n",
       "5               ID_12cadc6af_any      0\n",
       "6          ID_38fd7baa0_epidural      0\n",
       "7  ID_38fd7baa0_intraparenchymal      0\n",
       "8  ID_38fd7baa0_intraventricular      0\n",
       "9      ID_38fd7baa0_subarachnoid      0"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_df = pd.read_csv(base_url+'stage_2_train.csv')\n",
    "print(train_df.shape)\n",
    "train_df.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(4516842, 3)\n"
     ]
    }
   ],
   "source": [
    "train_df[['ID', 'Subtype']] = train_df['ID'].str.rsplit(pat='_', n=1, expand=True)\n",
    "print(train_df.shape)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at the table, each image has one row per class, labelled True (1) or False (0), so every image appears six times. We will pivot the table into a wide layout with one indicator column per class, so that each image occupies a single row."
   ]
  },
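  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of the long-to-wide reshaping described above, on a made-up two-image table (the IDs `ID_aaa` and `ID_bbb` and their labels are invented for illustration; they are not from the dataset). It assumes only pandas.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical long-format labels: one row per (image, subtype) pair.\n",
    "long_df = pd.DataFrame({\n",
    "    'ID': ['ID_aaa_epidural', 'ID_aaa_any', 'ID_bbb_epidural', 'ID_bbb_any'],\n",
    "    'Label': [1, 1, 0, 0],\n",
    "})\n",
    "\n",
    "# Split the subtype suffix off the image ID, as in the cell above.\n",
    "long_df[['ID', 'Subtype']] = long_df['ID'].str.rsplit(pat='_', n=1, expand=True)\n",
    "\n",
    "# One row per image, one indicator column per subtype.\n",
    "wide_df = long_df.pivot(index='ID', columns='Subtype', values='Label')\n",
    "print(wide_df)\n",
    "```\n",
    "\n",
    "`pivot` raises an error if an (ID, Subtype) pair is duplicated, whereas `pivot_table` would silently aggregate the duplicates (by mean, by default)."
   ]
  },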
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th>Label</th>\n",
       "      <th>Subtype</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4516836</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>epidural</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516837</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>intraparenchymal</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516838</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>intraventricular</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516839</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>subarachnoid</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516840</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>subdural</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4516841</th>\n",
       "      <td>ID_4a85a3a3f</td>\n",
       "      <td>0</td>\n",
       "      <td>any</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   ID  Label           Subtype\n",
       "4516836  ID_4a85a3a3f      0          epidural\n",
       "4516837  ID_4a85a3a3f      0  intraparenchymal\n",
       "4516838  ID_4a85a3a3f      0  intraventricular\n",
       "4516839  ID_4a85a3a3f      0      subarachnoid\n",
       "4516840  ID_4a85a3a3f      0          subdural\n",
       "4516841  ID_4a85a3a3f      0               any"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_df.tail(6)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fix_id(img_id, img_dir=TRAIN_DIR):\n",
    "    # Normalise an image ID to the form 'ID_<hash>'; img_dir is unused here.\n",
    "    if not re.match(r'ID_[a-z0-9]+', img_id):\n",
    "        sop = re.search(r'[a-z0-9]+', img_id)\n",
    "        if sop:\n",
    "            return f'ID_{sop[0]}'\n",
    "        print(img_id)  # no recognisable hash: report it and return the ID unchanged\n",
    "    return img_id"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0          ID_12cadc6af\n",
       "1          ID_12cadc6af\n",
       "2          ID_12cadc6af\n",
       "3          ID_12cadc6af\n",
       "4          ID_12cadc6af\n",
       "               ...     \n",
       "4516837    ID_4a85a3a3f\n",
       "4516838    ID_4a85a3a3f\n",
       "4516839    ID_4a85a3a3f\n",
       "4516840    ID_4a85a3a3f\n",
       "4516841    ID_4a85a3a3f\n",
       "Name: ID, Length: 4516842, dtype: object"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
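   ],
   "source": [
    "train_df['ID'] = train_df['ID'].apply(fix_id)  # assign the result; apply() does not modify the column in place\n",
    "train_df['ID']"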
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(752803, 7)\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th colspan=\"6\" halign=\"left\">Label</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Subtype</th>\n",
       "      <th></th>\n",
       "      <th>any</th>\n",
       "      <th>epidural</th>\n",
       "      <th>intraparenchymal</th>\n",
       "      <th>intraventricular</th>\n",
       "      <th>subarachnoid</th>\n",
       "      <th>subdural</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ID_000012eaf</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ID_000039fa0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ID_00005679d</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ID_00008ce3c</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ID_0000950d7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   ID Label                                             \\\n",
       "Subtype                 any epidural intraparenchymal intraventricular   \n",
       "0        ID_000012eaf     0        0                0                0   \n",
       "1        ID_000039fa0     0        0                0                0   \n",
       "2        ID_00005679d     0        0                0                0   \n",
       "3        ID_00008ce3c     0        0                0                0   \n",
       "4        ID_0000950d7     0        0                0                0   \n",
       "\n",
       "                               \n",
       "Subtype subarachnoid subdural  \n",
       "0                  0        0  \n",
       "1                  0        0  \n",
       "2                  0        0  \n",
       "3                  0        0  \n",
       "4                  0        0  "
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_new = train_df.pivot_table(index='ID', columns='Subtype').reset_index()\n",
    "print(train_new.shape)\n",
    "train_new.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Subtype\n",
      "any                 107933\n",
      "epidural              3145\n",
      "intraparenchymal     36118\n",
      "intraventricular     26205\n",
      "subarachnoid         35675\n",
      "subdural             47166\n",
      "dtype: int64\n"
     ]
    }
   ],
   "source": [
    "subtype_ct = train_new['Label'].sum(axis=0)\n",
    "print(subtype_ct)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Distribution of each type of hemorrhage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "sns.barplot(x=subtype_ct.values, y=subtype_ct.index);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "def id_to_filepath(img_id, img_dir=TRAIN_DIR):\n",
    "    filepath = f'{img_dir}/{img_id}.dcm' # pydicom doesn't play nice with Path objects\n",
    "    if os.path.exists(filepath):\n",
    "        return filepath\n",
    "    else:\n",
    "        return 'DNE'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th colspan=\"6\" halign=\"left\">Label</th>\n",
       "      <th>filepath</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Subtype</th>\n",
       "      <th></th>\n",
       "      <th>any</th>\n",
       "      <th>epidural</th>\n",
       "      <th>intraparenchymal</th>\n",
       "      <th>intraventricular</th>\n",
       "      <th>subarachnoid</th>\n",
       "      <th>subdural</th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ID_000012eaf</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ID_000039fa0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ID_00005679d</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ID_00008ce3c</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ID_0000950d7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   ID Label                                             \\\n",
       "Subtype                 any epidural intraparenchymal intraventricular   \n",
       "0        ID_000012eaf     0        0                0                0   \n",
       "1        ID_000039fa0     0        0                0                0   \n",
       "2        ID_00005679d     0        0                0                0   \n",
       "3        ID_00008ce3c     0        0                0                0   \n",
       "4        ID_0000950d7     0        0                0                0   \n",
       "\n",
       "                               \\\n",
       "Subtype subarachnoid subdural   \n",
       "0                  0        0   \n",
       "1                  0        0   \n",
       "2                  0        0   \n",
       "3                  0        0   \n",
       "4                  0        0   \n",
       "\n",
       "                                                  filepath  \n",
       "Subtype                                                     \n",
       "0        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  \n",
       "1        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  \n",
       "2        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  \n",
       "3        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  \n",
       "4        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  "
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_new['filepath'] = train_new['ID'].apply(id_to_filepath)\n",
    "train_new.head()"
   ]
  },
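  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "def get_patient_data(filepath):\n",
    "    # Return (PatientID, StudyInstanceUID, SeriesInstanceUID) read from the DICOM header only.\n",
    "    if filepath == 'DNE':\n",
    "        return None, None, None  # keep the tuple shape so zip(*...) can still unpack it\n",
    "    dcm_data = pydicom.dcmread(filepath, stop_before_pixels=True)\n",
    "    return dcm_data.PatientID, dcm_data.StudyInstanceUID, dcm_data.SeriesInstanceUID"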
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "100%|██████████| 752803/752803 [17:31<00:00, 716.03it/s] \n"
     ]
    }
   ],
   "source": [
    "tqdm.pandas()\n",
    "train_new['PatientID'], train_new['StudyID'], train_new['SeriesID'] = zip(*train_new['filepath'].progress_apply(get_patient_data))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th colspan=\"6\" halign=\"left\">Label</th>\n",
       "      <th>filepath</th>\n",
       "      <th>PatientID</th>\n",
       "      <th>StudyID</th>\n",
       "      <th>SeriesID</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Subtype</th>\n",
       "      <th></th>\n",
       "      <th>any</th>\n",
       "      <th>epidural</th>\n",
       "      <th>intraparenchymal</th>\n",
       "      <th>intraventricular</th>\n",
       "      <th>subarachnoid</th>\n",
       "      <th>subdural</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>ID_000012eaf</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_f15c0eee</td>\n",
       "      <td>ID_30ea2b02d4</td>\n",
       "      <td>ID_0ab5820b2a</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>ID_000039fa0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_eeaf99e7</td>\n",
       "      <td>ID_134d398b61</td>\n",
       "      <td>ID_5f8484c3e0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>ID_00005679d</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_18f2d431</td>\n",
       "      <td>ID_b5c26cda09</td>\n",
       "      <td>ID_203cd6ec46</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>ID_00008ce3c</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_ce8a3cd2</td>\n",
       "      <td>ID_974735bf79</td>\n",
       "      <td>ID_3780d48b28</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ID_0000950d7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>ID_0000aee4b</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_ce5f0b6c</td>\n",
       "      <td>ID_9aad90e421</td>\n",
       "      <td>ID_1e59488a44</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>ID_0000ca2f6</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_8c5a14af</td>\n",
       "      <td>ID_a84b7a0dcd</td>\n",
       "      <td>ID_d6ba679446</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>ID_0000f1657</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_df70c823</td>\n",
       "      <td>ID_04ef429610</td>\n",
       "      <td>ID_245e16180c</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>ID_000178e76</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_462abff7</td>\n",
       "      <td>ID_4fef99f0df</td>\n",
       "      <td>ID_72952d87fa</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>ID_00019828f</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_fc08e4cf</td>\n",
       "      <td>ID_ade653597d</td>\n",
       "      <td>ID_c0d8754a07</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   ID Label                                             \\\n",
       "Subtype                 any epidural intraparenchymal intraventricular   \n",
       "0        ID_000012eaf     0        0                0                0   \n",
       "1        ID_000039fa0     0        0                0                0   \n",
       "2        ID_00005679d     0        0                0                0   \n",
       "3        ID_00008ce3c     0        0                0                0   \n",
       "4        ID_0000950d7     0        0                0                0   \n",
       "5        ID_0000aee4b     0        0                0                0   \n",
       "6        ID_0000ca2f6     0        0                0                0   \n",
       "7        ID_0000f1657     0        0                0                0   \n",
       "8        ID_000178e76     0        0                0                0   \n",
       "9        ID_00019828f     0        0                0                0   \n",
       "\n",
       "                               \\\n",
       "Subtype subarachnoid subdural   \n",
       "0                  0        0   \n",
       "1                  0        0   \n",
       "2                  0        0   \n",
       "3                  0        0   \n",
       "4                  0        0   \n",
       "5                  0        0   \n",
       "6                  0        0   \n",
       "7                  0        0   \n",
       "8                  0        0   \n",
       "9                  0        0   \n",
       "\n",
       "                                                  filepath    PatientID  \\\n",
       "Subtype                                                                   \n",
       "0        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_f15c0eee   \n",
       "1        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_eeaf99e7   \n",
       "2        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_18f2d431   \n",
       "3        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_ce8a3cd2   \n",
       "4        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "5        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_ce5f0b6c   \n",
       "6        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_8c5a14af   \n",
       "7        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_df70c823   \n",
       "8        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_462abff7   \n",
       "9        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_fc08e4cf   \n",
       "\n",
       "               StudyID       SeriesID  \n",
       "Subtype                                \n",
       "0        ID_30ea2b02d4  ID_0ab5820b2a  \n",
       "1        ID_134d398b61  ID_5f8484c3e0  \n",
       "2        ID_b5c26cda09  ID_203cd6ec46  \n",
       "3        ID_974735bf79  ID_3780d48b28  \n",
       "4        ID_8881b1c4b1  ID_84296c3845  \n",
       "5        ID_9aad90e421  ID_1e59488a44  \n",
       "6        ID_a84b7a0dcd  ID_d6ba679446  \n",
       "7        ID_04ef429610  ID_245e16180c  \n",
       "8        ID_4fef99f0df  ID_72952d87fa  \n",
       "9        ID_ade653597d  ID_c0d8754a07  "
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "train_new.head(10)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "752803\n",
      "18938\n",
      "21744\n",
      "21744\n"
     ]
    }
   ],
   "source": [
    "# Dataset size: total rows, then the number of distinct patients, studies and series\n",
    "print(train_new.shape[0])\n",
    "print(len(train_new['PatientID'].unique()))\n",
    "print(len(train_new['StudyID'].unique()))\n",
    "print(len(train_new['SeriesID'].unique()))"
   ]
  },
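  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Write train_new (labels plus patient, study and series IDs) to disk; note the file name has no .csv extension\n",
    "train_new.to_csv('train_new')"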
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 105,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead tr th {\n",
       "        text-align: left;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th></th>\n",
       "      <th>ID</th>\n",
       "      <th colspan=\"6\" halign=\"left\">Label</th>\n",
       "      <th>filepath</th>\n",
       "      <th>PatientID</th>\n",
       "      <th>StudyID</th>\n",
       "      <th>SeriesID</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Subtype</th>\n",
       "      <th></th>\n",
       "      <th>any</th>\n",
       "      <th>epidural</th>\n",
       "      <th>intraparenchymal</th>\n",
       "      <th>intraventricular</th>\n",
       "      <th>subarachnoid</th>\n",
       "      <th>subdural</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>ID_0000950d7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>39014</th>\n",
       "      <td>ID_0d428e6ca</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>41787</th>\n",
       "      <td>ID_0e320ef83</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>55957</th>\n",
       "      <td>ID_12fa73df7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>62110</th>\n",
       "      <td>ID_15089384e</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>78497</th>\n",
       "      <td>ID_1aa35c21e</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>95072</th>\n",
       "      <td>ID_204e3c67f</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>95086</th>\n",
       "      <td>ID_204f882af</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>116802</th>\n",
       "      <td>ID_27ba4a0ed</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>168054</th>\n",
       "      <td>ID_39129ab61</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>210093</th>\n",
       "      <td>ID_4763efbcd</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>245270</th>\n",
       "      <td>ID_533fbdf73</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>277098</th>\n",
       "      <td>ID_5df494f62</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>347634</th>\n",
       "      <td>ID_75fe135a4</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>369043</th>\n",
       "      <td>ID_7d436ad0f</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>377907</th>\n",
       "      <td>ID_803e590cf</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>394664</th>\n",
       "      <td>ID_85f4ec603</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>423116</th>\n",
       "      <td>ID_8f9695aef</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>432554</th>\n",
       "      <td>ID_92cafa5f6</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>436494</th>\n",
       "      <td>ID_941d491aa</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>467763</th>\n",
       "      <td>ID_9ea9667e9</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>480805</th>\n",
       "      <td>ID_a31ccb407</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>501610</th>\n",
       "      <td>ID_aa4ce8ca8</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>512286</th>\n",
       "      <td>ID_ade7354a7</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>514805</th>\n",
       "      <td>ID_aec1177e4</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>521980</th>\n",
       "      <td>ID_b1281c12f</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>550282</th>\n",
       "      <td>ID_bad859f87</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>632620</th>\n",
       "      <td>ID_d6fef64ad</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>639962</th>\n",
       "      <td>ID_d97ba5b0d</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>642046</th>\n",
       "      <td>ID_da36abbca</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>659382</th>\n",
       "      <td>ID_e024df4d1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>671395</th>\n",
       "      <td>ID_e43f11a72</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>717566</th>\n",
       "      <td>ID_f40ac6f95</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>731883</th>\n",
       "      <td>ID_f8e0c635e</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>740708</th>\n",
       "      <td>ID_fbe090828</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>747831</th>\n",
       "      <td>ID_fe506a641</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>/home/ubuntu/kaggle/rsna-intracranial-hemorrha...</td>\n",
       "      <td>ID_d278c67b</td>\n",
       "      <td>ID_8881b1c4b1</td>\n",
       "      <td>ID_84296c3845</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   ID Label                                             \\\n",
       "Subtype                 any epidural intraparenchymal intraventricular   \n",
       "4        ID_0000950d7     0        0                0                0   \n",
       "39014    ID_0d428e6ca     0        0                0                0   \n",
       "41787    ID_0e320ef83     0        0                0                0   \n",
       "55957    ID_12fa73df7     0        0                0                0   \n",
       "62110    ID_15089384e     0        0                0                0   \n",
       "78497    ID_1aa35c21e     0        0                0                0   \n",
       "95072    ID_204e3c67f     0        0                0                0   \n",
       "95086    ID_204f882af     0        0                0                0   \n",
       "116802   ID_27ba4a0ed     1        0                1                0   \n",
       "168054   ID_39129ab61     0        0                0                0   \n",
       "210093   ID_4763efbcd     0        0                0                0   \n",
       "245270   ID_533fbdf73     0        0                0                0   \n",
       "277098   ID_5df494f62     0        0                0                0   \n",
       "347634   ID_75fe135a4     0        0                0                0   \n",
       "369043   ID_7d436ad0f     0        0                0                0   \n",
       "377907   ID_803e590cf     0        0                0                0   \n",
       "394664   ID_85f4ec603     0        0                0                0   \n",
       "423116   ID_8f9695aef     0        0                0                0   \n",
       "432554   ID_92cafa5f6     0        0                0                0   \n",
       "436494   ID_941d491aa     0        0                0                0   \n",
       "467763   ID_9ea9667e9     0        0                0                0   \n",
       "480805   ID_a31ccb407     0        0                0                0   \n",
       "501610   ID_aa4ce8ca8     0        0                0                0   \n",
       "512286   ID_ade7354a7     0        0                0                0   \n",
       "514805   ID_aec1177e4     0        0                0                0   \n",
       "521980   ID_b1281c12f     0        0                0                0   \n",
       "550282   ID_bad859f87     0        0                0                0   \n",
       "632620   ID_d6fef64ad     1        0                1                0   \n",
       "639962   ID_d97ba5b0d     0        0                0                0   \n",
       "642046   ID_da36abbca     0        0                0                0   \n",
       "659382   ID_e024df4d1     0        0                0                0   \n",
       "671395   ID_e43f11a72     0        0                0                0   \n",
       "717566   ID_f40ac6f95     1        0                1                0   \n",
       "731883   ID_f8e0c635e     0        0                0                0   \n",
       "740708   ID_fbe090828     0        0                0                0   \n",
       "747831   ID_fe506a641     0        0                0                0   \n",
       "\n",
       "                               \\\n",
       "Subtype subarachnoid subdural   \n",
       "4                  0        0   \n",
       "39014              0        0   \n",
       "41787              0        0   \n",
       "55957              0        0   \n",
       "62110              0        0   \n",
       "78497              0        0   \n",
       "95072              0        0   \n",
       "95086              0        0   \n",
       "116802             0        0   \n",
       "168054             0        0   \n",
       "210093             0        0   \n",
       "245270             0        0   \n",
       "277098             0        0   \n",
       "347634             0        0   \n",
       "369043             0        0   \n",
       "377907             0        0   \n",
       "394664             0        0   \n",
       "423116             0        0   \n",
       "432554             0        0   \n",
       "436494             0        0   \n",
       "467763             0        0   \n",
       "480805             0        0   \n",
       "501610             0        0   \n",
       "512286             0        0   \n",
       "514805             0        0   \n",
       "521980             0        0   \n",
       "550282             0        0   \n",
       "632620             0        0   \n",
       "639962             0        0   \n",
       "642046             0        0   \n",
       "659382             0        0   \n",
       "671395             0        0   \n",
       "717566             0        0   \n",
       "731883             0        0   \n",
       "740708             0        0   \n",
       "747831             0        0   \n",
       "\n",
       "                                                  filepath    PatientID  \\\n",
       "Subtype                                                                   \n",
       "4        /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "39014    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "41787    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "55957    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "62110    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "78497    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "95072    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "95086    /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "116802   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "168054   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "210093   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "245270   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "277098   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "347634   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "369043   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "377907   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "394664   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "423116   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "432554   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "436494   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "467763   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "480805   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "501610   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "512286   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "514805   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "521980   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "550282   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "632620   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "639962   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "642046   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "659382   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "671395   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "717566   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "731883   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "740708   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "747831   /home/ubuntu/kaggle/rsna-intracranial-hemorrha...  ID_d278c67b   \n",
       "\n",
       "               StudyID       SeriesID  \n",
       "Subtype                                \n",
       "4        ID_8881b1c4b1  ID_84296c3845  \n",
       "39014    ID_8881b1c4b1  ID_84296c3845  \n",
       "41787    ID_8881b1c4b1  ID_84296c3845  \n",
       "55957    ID_8881b1c4b1  ID_84296c3845  \n",
       "62110    ID_8881b1c4b1  ID_84296c3845  \n",
       "78497    ID_8881b1c4b1  ID_84296c3845  \n",
       "95072    ID_8881b1c4b1  ID_84296c3845  \n",
       "95086    ID_8881b1c4b1  ID_84296c3845  \n",
       "116802   ID_8881b1c4b1  ID_84296c3845  \n",
       "168054   ID_8881b1c4b1  ID_84296c3845  \n",
       "210093   ID_8881b1c4b1  ID_84296c3845  \n",
       "245270   ID_8881b1c4b1  ID_84296c3845  \n",
       "277098   ID_8881b1c4b1  ID_84296c3845  \n",
       "347634   ID_8881b1c4b1  ID_84296c3845  \n",
       "369043   ID_8881b1c4b1  ID_84296c3845  \n",
       "377907   ID_8881b1c4b1  ID_84296c3845  \n",
       "394664   ID_8881b1c4b1  ID_84296c3845  \n",
       "423116   ID_8881b1c4b1  ID_84296c3845  \n",
       "432554   ID_8881b1c4b1  ID_84296c3845  \n",
       "436494   ID_8881b1c4b1  ID_84296c3845  \n",
       "467763   ID_8881b1c4b1  ID_84296c3845  \n",
       "480805   ID_8881b1c4b1  ID_84296c3845  \n",
       "501610   ID_8881b1c4b1  ID_84296c3845  \n",
       "512286   ID_8881b1c4b1  ID_84296c3845  \n",
       "514805   ID_8881b1c4b1  ID_84296c3845  \n",
       "521980   ID_8881b1c4b1  ID_84296c3845  \n",
       "550282   ID_8881b1c4b1  ID_84296c3845  \n",
       "632620   ID_8881b1c4b1  ID_84296c3845  \n",
       "639962   ID_8881b1c4b1  ID_84296c3845  \n",
       "642046   ID_8881b1c4b1  ID_84296c3845  \n",
       "659382   ID_8881b1c4b1  ID_84296c3845  \n",
       "671395   ID_8881b1c4b1  ID_84296c3845  \n",
       "717566   ID_8881b1c4b1  ID_84296c3845  \n",
       "731883   ID_8881b1c4b1  ID_84296c3845  \n",
       "740708   ID_8881b1c4b1  ID_84296c3845  \n",
       "747831   ID_8881b1c4b1  ID_84296c3845  "
      ]
     },
     "execution_count": 105,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Show every slice (row) recorded for a single patient\n",
    "train_new[train_new.PatientID == 'ID_d278c67b']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 95,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True     733865\n",
       "False     18938\n",
       "Name: PatientID, dtype: int64"
      ]
     },
     "execution_count": 95,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
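    "# True = row repeats an earlier PatientID; False = first occurrence,\n",
    "# so the False count equals the number of distinct patients\n",
    "train_new.PatientID.duplicated().value_counts()"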
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}